This document provides a detailed description of the data model used in all our experiments with the platforms Kubernetes, MicroK8s, and K3s. Furthermore, we present detailed insights into all runs of each platform.
The basis of our data model is a so-called minion. A minion is simply a data packet (a list of two elements) consisting of simulation-related data and system-related metrics such as CPU, memory, and disk utilization for a certain container orchestration platform. We obtain a minion by searching for a certain sim_id.
sim_id <- 5000
As an example, the following function obtains the data of the simulation with id 5000 from a MongoDB instance.
sample_minion <- get_minion_data(sim_id)
names(sample_minion)
## [1] "simulation_data" "system_metrics"
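For illustration, such a minion can be sketched as a plain R list; the structure below mirrors the tables shown in the following sections, but all values are made up:

```r
# A hypothetical, minimal minion mirroring the structure returned by
# get_minion_data(); the values are invented for illustration only.
minion_sketch <- list(
  simulation_data = list(
    simulation = data.frame(sim_id = 5000, platform = "mK8s"),
    hosts = data.frame(hostname = c("uniba-dsg-h12", "uniba-dsg-h54"),
                       role = c("master", "worker")),
    events = data.frame(event_type = "EventType.START_SIMULATION",
                        timestamp = 1613401076)
  ),
  system_metrics = list(
    system_cpu = data.frame(hostname = "uniba-dsg-h54", chart_name = "system.cpu",
                            units = "percentage", id = "idle",
                            value = 99.3974880, timestamp = 1613401076)
  )
)
names(minion_sketch)
```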
As mentioned, the data consists of simulation_data and system_metrics. The simulation data (simulation_data) is separated into three tables: simulation, hosts, and events. The next chapter covers some details regarding the simulation data.
| _id | sim_id | platform |
|---|---|---|
| 602a8bec3497ec051e827085 | 5000 | mK8s |
As shown, the simulation data contains a unique identifier (the _id assigned by MongoDB), the self-assigned sim_id that identifies the simulation (independently of the id assigned by MongoDB), and the name of the platform that is subject to simulation.
| _id | simulation_id | hostname | role |
|---|---|---|---|
| 602a8bed3497ec0537bb3c5f | 602a8bec3497ec051e827085 | uniba-dsg-linux0 | controller |
| 602a8bed3497ec05a643072b | 602a8bec3497ec051e827085 | uniba-dsg-h54 | worker |
| 602a8bed3497ec05a4162037 | 602a8bec3497ec051e827085 | uniba-dsg-h12 | master |
| 602a8bed3497ec05a7ca6ceb | 602a8bec3497ec051e827085 | uniba-dsg-h34 | worker |
| 602a8bed3497ec05a8a624f0 | 602a8bec3497ec051e827085 | uniba-dsg-h44 | worker |
The table hosts contains information about all hosts involved in the experiment and their roles. Again, we store a unique identifier for the database record, the key of the previously stored simulation to which the hosts belong, the human-readable names of the different hosts, and the role of each host. The controller node is only stored for the sake of simplicity and is not used in the further experiment.
The columns correlation_id and body were not used in our experiment; hence they will be omitted in the further analysis. Later on, we will match the different events and the system-related metrics based on timestamps.
The system metrics (system_metrics) contain all system-related data that is suitable for a performance comparison. For example, we include CPU, memory, and disk utilization. All data is stored by netdata continuously during the uptime of the virtual machines. We chose a sample rate of \(5\,sec.\) for all metrics. Netdata adds a unix timestamp to each record and records all metrics for all hosts independently. In the following, we present a short overview of the different system-related metrics.
sample_cpu_metrics <- sample_minion$system_metrics$system_cpu
display.as.table(
  sample_cpu_metrics %>%
    filter(., timestamp == timestamp[1]) %>%
    filter(., hostname == hostname[1])
)
| hostname | chart_name | units | id | value | timestamp |
|---|---|---|---|---|---|
| uniba-dsg-h54 | system.cpu | percentage | guest_nice | 0.0000000 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | guest | 0.0000000 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | steal | 0.0000000 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | softirq | 0.0000000 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | irq | 0.0000000 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | user | 0.5025125 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | system | 0.1000000 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | nice | 0.0000000 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | iowait | 0.0000000 | 1613401076 |
| uniba-dsg-h54 | system.cpu | percentage | idle | 99.3974880 | 1613401076 |
As shown in the table above, netdata provides a high depth of detail (column id). The CPU utilization is measured in ten different dimensions. Netdata averages the CPU utilization of all available CPU cores per machine. That means \(100-idle\) is the total CPU utilization for a certain timestamp (e.g., \(100-99.3974880=0.6025120\)). The next table shows the idle utilization of all machines in the experiment for the timestamp 1613401076. We use these values later to obtain the average utilization. Please note that we apply the previously mentioned formula to obtain the CPU utilization.
# processing.R
calculate.cpu.utilization <- function(idle) {
  return(round(100 - as.numeric(idle), 4))
}

display.as.table(
  sample_cpu_metrics %>%
    filter(., timestamp == timestamp[1]) %>%
    filter(., id == 'idle') %>%
    mutate(., value = calculate.cpu.utilization(value))
)
| hostname | chart_name | units | id | value | timestamp |
|---|---|---|---|---|---|
| uniba-dsg-h54 | system.cpu | percentage | idle | 0.6025 | 1613401076 |
| uniba-dsg-h12 | system.cpu | percentage | idle | 0.7020 | 1613401076 |
| uniba-dsg-h34 | system.cpu | percentage | idle | 0.5000 | 1613401076 |
| uniba-dsg-h44 | system.cpu | percentage | idle | 0.6025 | 1613401076 |
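As a quick sanity check on the values from the first CPU table above, the sum of all non-idle dimensions matches the total utilization obtained via \(100-idle\):

```r
# CPU dimensions of uniba-dsg-h54 at timestamp 1613401076 (values from the table above)
cpu <- c(guest_nice = 0, guest = 0, steal = 0, softirq = 0, irq = 0,
         user = 0.5025125, system = 0.1, nice = 0, iowait = 0, idle = 99.3974880)
total_util <- round(100 - cpu[["idle"]], 4)           # utilization via 100 - idle
non_idle   <- round(sum(cpu[names(cpu) != "idle"]), 4) # sum of all non-idle dimensions
```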
Netdata collects the memory metrics in the same way. The following table shows the available metrics for the memory utilization.
sample_ram_metrics <- sample_minion$system_metrics$system_ram
display.as.table(
  sample_ram_metrics %>%
    filter(., timestamp == timestamp[1]) %>%
    filter(., hostname == hostname[1])
)
| hostname | chart_name | units | id | value | timestamp |
|---|---|---|---|---|---|
| uniba-dsg-h54 | system.ram | MiB | free | 2147.9366 | 1613401076 |
| uniba-dsg-h54 | system.ram | MiB | used | 311.7664 | 1613401076 |
| uniba-dsg-h54 | system.ram | MiB | cached | 1311.4782 | 1613401076 |
| uniba-dsg-h54 | system.ram | MiB | buffers | 165.0063 | 1613401076 |
For our analysis, we obtain the memory utilization by applying the following formula: \((used/total)*100\), e.g., \((311.7664/3936)*100\approx 7.92\,\%\). Again, we will average the results based on timestamps and hosts.
# define total available ram capacity
total_ram_capacity <- 3936
# processing.R
calculate.ram.utilization <- function(value) {
  return(round((as.numeric(value) / total_ram_capacity) * 100, 4))
}

display.as.table(
  sample_ram_metrics %>%
    filter(., timestamp == timestamp[1]) %>%
    filter(., id == 'used') %>%
    mutate(., value = calculate.ram.utilization(value))
)
| hostname | chart_name | units | id | value | timestamp |
|---|---|---|---|---|---|
| uniba-dsg-h54 | system.ram | MiB | used | 7.9209 | 1613401076 |
| uniba-dsg-h12 | system.ram | MiB | used | 7.9960 | 1613401076 |
| uniba-dsg-h34 | system.ram | MiB | used | 7.8723 | 1613401076 |
| uniba-dsg-h44 | system.ram | MiB | used | 8.0573 | 1613401076 |
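As a consistency check on the raw values from the first memory table above, the four dimensions add up (almost exactly) to the total capacity of 3936 MiB, and applying the utilization formula to used reproduces the value shown for uniba-dsg-h54 in the table above:

```r
# Memory dimensions of uniba-dsg-h54 (values from the first RAM table above)
ram <- c(free = 2147.9366, used = 311.7664, cached = 1311.4782, buffers = 165.0063)
total_ram_capacity <- 3936
# the four dimensions add up almost exactly to the machine's total capacity
deviation <- abs(sum(ram) - total_ram_capacity)
# utilization as defined above: (used / total) * 100
util <- round(ram[["used"]] / total_ram_capacity * 100, 4)
```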
The disk utilization provides only a single metric. Netdata already delivers a calculated value, defined as the percentage of time the disk drive spent in input and output operations.
sample_disk_util_metrics <- sample_minion$system_metrics$system_disk_util
display.as.table(
  sample_disk_util_metrics %>%
    filter(., timestamp == timestamp[1]) %>%
    filter(., hostname == hostname[1])
)
| hostname | chart_name | units | id | value | timestamp |
|---|---|---|---|---|---|
| uniba-dsg-h54 | disk_util.sda | % of time working | utilization | 0.3018799 | 1613401076 |
Again, we will mutate the column value according to our definition.
# processing.R
calculate.disk.utilization <- function(value) {
  return(round(as.numeric(value), 4))
}

display.as.table(
  sample_disk_util_metrics %>%
    filter(., timestamp == timestamp[1]) %>%
    filter(., id == 'utilization')
)
| hostname | chart_name | units | id | value | timestamp |
|---|---|---|---|---|---|
| uniba-dsg-h54 | disk_util.sda | % of time working | utilization | 0.3018799 | 1613401076 |
| uniba-dsg-h12 | disk_util.sda | % of time working | utilization | 0.1972955 | 1613401076 |
| uniba-dsg-h34 | disk_util.sda | % of time working | utilization | 0.2313757 | 1613401076 |
| uniba-dsg-h44 | disk_util.sda | % of time working | utilization | 0.3274032 | 1613401076 |
All metrics are stored with a unix timestamp. For all experiments, the concrete timestamps are not relevant; only the amount of time taken for the different steps of the experiment, and for the experiment in total, is important. To improve readability, we transform all timestamps to a time duration as follows:
# processing.R
calculate.duration <- function(timestamp) {
  return(round(timestamp - timestamp[1], 4))
}

display.as.datatable(
  sample_minion$simulation_data$events %>%
    mutate(., duration = calculate.duration(timestamp)) %>%
    select(., c(-correlation_id, -body))
)
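As a small self-contained example of this transformation (the function is repeated here), three consecutive samples at the 5-second rate map to the elapsed seconds 0, 5, and 10:

```r
# calculate.duration repeated here so the snippet is self-contained
calculate.duration <- function(timestamp) {
  return(round(timestamp - timestamp[1], 4))
}
# three consecutive samples at the 5-second sample rate
calculate.duration(c(1613401076, 1613401081, 1613401086))  # -> 0 5 10
```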
Furthermore, we removed the columns correlation_id and body because they were not used in this experiment; they are intended for future experiments.
This leads us to the following function:
# processing.R
process.minion.simuation_data <- function(minion) {
  minion$simulation_data$events <- minion$simulation_data$events %>%
    mutate(., timestamp = round(timestamp, 4)) %>%
    mutate(., duration = calculate.duration(timestamp)) %>%
    select(., c(-correlation_id, -body))
  return(minion)
}

display.as.datatable(
  process.minion.simuation_data(sample_minion)$simulation_data$events
)
Finally, we provide a function that performs all transformations in one step and allows filtering by certain metric ids as well.
# processing.R
process.minion.system_metric <- function(minion, metric, filter_id, fun) {
  minion$system_metrics[[metric]] <- minion$system_metrics[[metric]] %>%
    mutate(., value = fun(value)) %>%
    mutate(., timestamp = calculate.duration(timestamp)) %>%
    filter(., id == filter_id)
  return(minion)
}

display.as.datatable(
  process.minion.system_metric(sample_minion, 'system_cpu', 'idle', calculate.cpu.utilization)$system_metrics$system_cpu
)
The following section describes the basic steps to process and visualize the raw data. The following actions do not modify the data (as opposed to data preprocessing, where a lot of values are transformed, changed, or removed); only data reorganization (i.e., moving values from A to B) is performed to enable data visualization and analysis.
This section shows the raw data that was obtained during the experiment. In order to show the homogeneity of the data for all runs and all platforms, we present the data without any data processing or aggregation. We will do that platform- and metric-wise.
Firstly, we obtain all necessary data from the database for the later analysis. For this, we select a number of experiments and collect them in a list because all database functions accept a list of arguments. We select the different runs based on the previously introduced sim_ids (see subsection Simulation).
k8s_sim_ids <- c(1000:1026)
k8s_broken_sim_ids <- c(1023, 1025)
k8s_sim_ids <- setdiff(k8s_sim_ids, k8s_broken_sim_ids)
We removed the simulations with the ids 1023 and 1025 because of broken internet connectivity during the experiment.
For K8s, we performed 25 runs. In the next step we obtain all K8s minions from the database.
k8s_minions <- lapply(k8s_sim_ids, get.minion)
In the next step we obtain the CPU utilization for all K8s simulations. We use the following helper function (bind.minion.system_metrics), which binds the system_metrics of different minions. The function takes a set of minions (e.g., a list) as an argument, the metric to be considered (e.g., system_cpu, system_ram, or system_disk_util), the id (e.g., idle, used, utilization), and finally the function to be applied to the column value (e.g., calculate.cpu.utilization, calculate.ram.utilization, calculate.disk.utilization). Furthermore, we transform the timestamps as explained in System metrics. We first apply the function process.minion.system_metric. Afterward, we add the sim_id and the platform as columns to the table to simplify facetting of many plots. Also, we add the role of the respective node (e.g., ‘worker’ or ‘master’). The function returns one single table that contains the selected metric for all minions in the given list.
# processing.R
bind.minion.system_metrics <- function(minions, metric, filter_id, fun) {
  minions <- lapply(minions, process.minion.system_metric, metric, filter_id, fun)
  minions <- Reduce(rbind, lapply(minions, function(minion) {
    minion$system_metrics[[metric]]$sim_id <- minion$simulation_data$simulation$sim_id[1]
    minion$system_metrics[[metric]]$platform <- minion$simulation_data$simulation$platform[1]
    minion$system_metrics[[metric]] <- minion$system_metrics[[metric]] %>%
      left_join(select(minion$simulation_data$hosts, hostname, role), by = "hostname")
    return(minion$system_metrics[[metric]])
  }))
  return(minions)
}
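The core binding step, Reduce(rbind, ...), simply stacks the per-minion tables row-wise, as this minimal sketch with two hypothetical one-row tables shows:

```r
# Two toy per-minion metric tables (hypothetical values)
tables <- list(data.frame(sim_id = 1000, value = 0.5),
               data.frame(sim_id = 1001, value = 0.7))
bound <- Reduce(rbind, tables)  # one table; here, one row per minion
```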
k8s_bound_system_metrics_cpu <- bind.minion.system_metrics(k8s_minions, 'system_cpu', 'idle', calculate.cpu.utilization)
display.as.datatable(
  k8s_bound_system_metrics_cpu %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
Now we can derive the CPU utilization for all simulations (sim_id 1000–1026).
# ggplot.R
plot.minion.bound_system_metrics <- function(bound_system_metrics, y_lab) {
  ggplot(bound_system_metrics, aes(x = timestamp, y = value)) +
    geom_line(aes(color = role)) +
    ggtitle(bound_system_metrics$platform[1],
            subtitle = paste0("Simulations ",
                              unique(bound_system_metrics$sim_id)[1],
                              " - ",
                              unique(bound_system_metrics$sim_id)[length(unique(bound_system_metrics$sim_id))])) +
    xlab("Time (sec.)") +
    ylab(y_lab) +
    scale_color_hc() +
    theme_bw() +
    facet_grid(sim_id ~ hostname)
}
plot.minion.bound_system_metrics(k8s_bound_system_metrics_cpu, "CPU util. (%)")
In the next step we provide some basic analysis to assess the between and within statistics. We obtain selected measurements grouped by the hostnames and sim_ids.
# processing.R
summarise.minion.system_metrics <- function(bound_system_metrics, groups) {
  bound_system_metrics %>%
    select(c(groups, value)) %>%
    group_by_at(vars(groups)) %>%
    summarise_each(funs(min, max, median,
                        q0.25 = quantile(., probs = .25),
                        q0.75 = quantile(., probs = .75),
                        mean, sd)) %>%
    mutate_at(vars(-hostname, -sim_id), round, 4)
}

display.as.datatable(
  summarise.minion.system_metrics(k8s_bound_system_metrics_cpu, c("hostname", "sim_id"))
)
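The per-group statistics reduce to the following base-R computations, shown here on a toy vector of utilization values (hypothetical numbers):

```r
# Toy utilization values standing in for one hostname/sim_id group
values <- c(0.5, 0.6, 0.7, 0.9, 1.2)
stats <- round(c(min = min(values), max = max(values), median = median(values),
                 q0.25 = quantile(values, probs = .25, names = FALSE),
                 q0.75 = quantile(values, probs = .75, names = FALSE),
                 mean = mean(values), sd = sd(values)), 4)
```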
For memory utilization, we provide the same steps as for CPU utilization. Firstly, we bind the different system_metrics:
k8s_bound_system_metrics_mem <- bind.minion.system_metrics(k8s_minions, 'system_ram', 'used', calculate.ram.utilization)
display.as.datatable(
  k8s_bound_system_metrics_mem %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
Now we can derive the memory utilization for all simulations (sim_id 1000–1026).
plot.minion.bound_system_metrics(k8s_bound_system_metrics_mem, "Memory util. (%)")
In the next step we provide some basic analysis to assess the between and within statistics. We obtain selected measurements grouped by the hostnames and sim_ids.
display.as.datatable(
  summarise.minion.system_metrics(k8s_bound_system_metrics_mem, c("hostname", "sim_id"))
)
For the disk utilization, we bind the different system_metrics again:
k8s_bound_system_metrics_disk_util <- bind.minion.system_metrics(k8s_minions, 'system_disk_util', 'utilization', calculate.disk.utilization)
display.as.datatable(
  k8s_bound_system_metrics_disk_util %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
Again, we derive the utilization for all simulations (sim_id 1000–1026).
plot.minion.bound_system_metrics(k8s_bound_system_metrics_disk_util, "Disk util. (%)")
Again, we provide some basic analysis to assess the between and within statistics.
display.as.datatable(
  summarise.minion.system_metrics(k8s_bound_system_metrics_disk_util, c("hostname", "sim_id"))
)
The description of the raw data is identical for MicroK8s and K3s.
mk8s_sim_ids <- c(5000:5025)
mk8s_broken_sim_ids <- c(5013)
mk8s_sim_ids <- setdiff(mk8s_sim_ids, mk8s_broken_sim_ids)
We removed the simulation with id 5013 because of broken internet connectivity during the experiment.
For MicroK8s, we performed 25 runs.
mk8s_minions <- lapply(mk8s_sim_ids, get.minion)
mk8s_bound_system_metrics_cpu <- bind.minion.system_metrics(mk8s_minions, 'system_cpu', 'idle', calculate.cpu.utilization)
display.as.datatable(
  mk8s_bound_system_metrics_cpu %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
plot.minion.bound_system_metrics(mk8s_bound_system_metrics_cpu, "CPU util. (%)")
display.as.datatable(
  summarise.minion.system_metrics(mk8s_bound_system_metrics_cpu, c("hostname", "sim_id"))
)
mk8s_bound_system_metrics_mem <- bind.minion.system_metrics(mk8s_minions, 'system_ram', 'used', calculate.ram.utilization)
display.as.datatable(
  mk8s_bound_system_metrics_mem %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
plot.minion.bound_system_metrics(mk8s_bound_system_metrics_mem, "Memory util. (%)")

display.as.datatable(
  summarise.minion.system_metrics(mk8s_bound_system_metrics_mem, c("hostname", "sim_id"))
)
mk8s_bound_system_metrics_disk_util <- bind.minion.system_metrics(mk8s_minions, 'system_disk_util', 'utilization', calculate.disk.utilization)
display.as.datatable(
  mk8s_bound_system_metrics_disk_util %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
plot.minion.bound_system_metrics(mk8s_bound_system_metrics_disk_util, "Disk util. (%)")
display.as.datatable(
  summarise.minion.system_metrics(mk8s_bound_system_metrics_disk_util, c("hostname", "sim_id"))
)
k3s_sim_ids <- c(3050:3076)
k3s_broken_sim_ids <- c(3057, 3058)
k3s_sim_ids <- setdiff(k3s_sim_ids, k3s_broken_sim_ids)
We removed the simulations with the ids 3057 and 3058 because of broken internet connectivity during the experiment.
For K3s, we performed 25 runs.
k3s_minions <- lapply(k3s_sim_ids, get.minion)
k3s_bound_system_metrics_cpu <- bind.minion.system_metrics(k3s_minions, 'system_cpu', 'idle', calculate.cpu.utilization)
display.as.datatable(
  k3s_bound_system_metrics_cpu %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
plot.minion.bound_system_metrics(k3s_bound_system_metrics_cpu, "CPU util. (%)")
display.as.datatable(
  summarise.minion.system_metrics(k3s_bound_system_metrics_cpu, c("hostname", "sim_id"))
)
k3s_bound_system_metrics_mem <- bind.minion.system_metrics(k3s_minions, 'system_ram', 'used', calculate.ram.utilization)
display.as.datatable(
  k3s_bound_system_metrics_mem %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
plot.minion.bound_system_metrics(k3s_bound_system_metrics_mem, "Memory util. (%)")
display.as.datatable(
  summarise.minion.system_metrics(k3s_bound_system_metrics_mem, c("hostname", "sim_id"))
)
k3s_bound_system_metrics_disk_util <- bind.minion.system_metrics(k3s_minions, 'system_disk_util', 'utilization', calculate.disk.utilization)
display.as.datatable(
  k3s_bound_system_metrics_disk_util %>%
    filter(., timestamp == 0) %>%
    group_by(sim_id) %>%
    slice(1)
)
plot.minion.bound_system_metrics(k3s_bound_system_metrics_disk_util, "Disk util. (%)")
display.as.datatable(
  summarise.minion.system_metrics(k3s_bound_system_metrics_disk_util, c("hostname", "sim_id"))
)
After identifying potential outliers based on the system metrics, we will have a look at the simulation data. The most important table is the events table. In order to show some basic metrics, we need to bind all metrics into one single table (similar to the system metrics), including details like the simulation id and the name of the platform (i.e., sim_id, platform). Furthermore, we remove the column hostname because all events are monitored by the controller node.
In order to calculate the length of events, we need to specify a certain set of events. We use the columns event_type and duration to calculate the length. The following table indicates which events are used and how they are calculated:
| Event | Event #1 | Event #2 |
|---|---|---|
| “System idle” | EventType.START_SIMULATION | EventType.START_KNS_PLATFORM |
The following function helps to obtain a particular pair of timestamps (i.e., durations) for a given event (e.g., event no. 1):
# events.R
calculate.minion.simulation_data.events.timestamps <- function(regexp, events) {
  return(events[grepl(regexp, events$event_type), colnames(events) == "duration"])
}

calculate.minion.simulation_data.events.timestamps(
  "EventType.START_SIMULATION|EventType.START_KNS_PLATFORM",
  process.minion.simuation_data(k8s_minions[[1]])$simulation_data$events
)
## [1] 0.0000 300.8376
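The length of an event is then simply the difference between the two durations of such a pair, e.g., for the "System idle" event above:

```r
# pair of durations returned above for "System idle" (K8s, first run)
system_idle <- c(0.0000, 300.8376)
event_length <- diff(system_idle)  # elapsed seconds between start and end event
```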
To provide a better and cleaner derivation of all events of interest, we implemented the following function, which creates a new table in which all event types are mapped to simulation events (e.g., EventType.START_SIMULATION and EventType.START_KNS_PLATFORM -> System idle 1). We also include the sim_id and the platform. As before, only the timestamps are transformed to durations. We store the event_names for further usage:
event_names <- factor(c("System idle 1", "Start master", "Master idle", "Add workers", "Cluster idle", "Apply deployment", "Deployment idle", "Delete deployment", "Drain workers", "Stop master", "System idle 2", "Total simulation time"))
event_names <- factor(event_names, levels = event_names)
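Re-creating the factor with explicit levels is necessary because factor() sorts its levels alphabetically by default, which would scramble the chronological order of the events in plots and tables:

```r
# Without explicit levels, factor() orders levels alphabetically
x <- factor(c("Start master", "Add workers"))
levels(x)  # "Add workers" comes first, breaking chronology
# Re-creating the factor with levels in data order preserves chronology
x <- factor(x, levels = c("Start master", "Add workers"))
levels(x)
```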
# events.R
get.minion.simulation_data.simulation_events <- function(minion) {
  minion <- process.minion.simuation_data(minion)
  event_types_regexp <- c("EventType.START_SIMULATION|EventType.START_KNS_PLATFORM",
                          "EventType.START_KNS_PLATFORM|EventType.FINISHED_START_KNS_PLATFORM",
                          "EventType.FINISHED_START_KNS_PLATFORM|EventType.START_KNS_JOIN_NODES",
                          "EventType.START_KNS_JOIN_NODES|EventType.FINISHED_START_KNS_JOIN_NODES",
                          "EventType.FINISHED_START_KNS_JOIN_NODES|EventType.START_KNS_DEPLOYMENT",
                          "EventType.START_KNS_DEPLOYMENT|EventType.FINISHED_START_KNS_DEPLOYMENT",
                          "EventType.FINISHED_START_KNS_DEPLOYMENT|EventType.STOP_KNS_DEPLOYMENT",
                          "EventType.STOP_KNS_DEPLOYMENT|EventType.FINISHED_STOP_KNS_DEPLOYMENT",
                          "EventType.START_KNS_UNJOIN_NODES|EventType.FINISHED_KNS_UNJOIN_NODES",
                          "EventType.STOP_KNS_PLATFORM|EventType.FINISHED_STOP_KNS_PLATFORM",
                          "EventType.FINISHED_STOP_KNS_PLATFORM|EventType.STOP_SIMULATION",
                          "EventType.START_SIMULATION|EventType.STOP_SIMULATION")
  return(
    data.frame(
      event = rep(event_names, 1, each = 2),
      type = rep(c("Start", "End"), 2, each = 1),
      duration = unlist(lapply(event_types_regexp, calculate.minion.simulation_data.events.timestamps, minion$simulation_data$events)),
      sim_id = rep(minion$simulation_data$simulation$sim_id, length(event_names) * 2),
      platform = rep(minion$simulation_data$simulation$platform, length(event_names) * 2)
    )
  )
}

display.as.datatable(
  get.minion.simulation_data.simulation_events(k8s_minions[[1]])
)
The following function moves all calculated simulation events into one single table.
# processing.R
bind.minion.simulation_data.simulation_events <- function(minions) {
  return(Reduce(rbind, lapply(minions, get.minion.simulation_data.simulation_events)))
}

display.as.datatable(
  k8s_bound_simulation_events <- bind.minion.simulation_data.simulation_events(k8s_minions)
)
The same calculation applies to MicroK8s and K3s.
display.as.datatable(
  mk8s_bound_simulation_events <- bind.minion.simulation_data.simulation_events(mk8s_minions)
)

display.as.datatable(
  k3s_bound_simulation_events <- bind.minion.simulation_data.simulation_events(k3s_minions)
)